Learning Tuple Probabilities

Authors

  • Maximilian Dylla
  • Martin Theobald
Abstract

Learning the parameters of complex probabilistic-relational models from labeled training data is a standard technique in machine learning, which has been intensively studied in the subfield of Statistical Relational Learning (SRL), but so far this is still an under-investigated topic in the context of Probabilistic Databases (PDBs). In this paper, we focus on learning the probability values of base tuples in a PDB from labeled lineage formulas. The resulting learning problem can be viewed as the inverse problem to confidence computations in PDBs: given a set of labeled query answers, learn the probability values of the base tuples, such that the marginal probabilities of the query answers again yield the assigned probability labels. We analyze the learning problem from a theoretical perspective, cast it into an optimization problem, and provide an algorithm based on stochastic gradient descent. Finally, we conclude with an experimental evaluation on three real-world datasets and one synthetic dataset, comparing our approach to various techniques from SRL, reasoning in information extraction, and optimization.
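
The inverse problem described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a tuple-independent PDB, represents each lineage formula as a plain Python predicate over a possible world, computes marginals by brute-force enumeration (feasible only for a handful of tuples), and minimizes a squared-error loss with a constant learning rate. The names `marginal`, `gradient`, `learn`, the tuple identifiers `t1`..`t3`, and the labels are all hypothetical choices made for illustration.

```python
import itertools
import random


def marginal(formula, variables, probs, fixed=None):
    """P(formula) under tuple independence, by enumerating possible worlds.

    `fixed` optionally pins some tuples to True/False (conditioning).
    """
    fixed = fixed or {}
    free = [v for v in variables if v not in fixed]
    total = 0.0
    for values in itertools.product([False, True], repeat=len(free)):
        world = dict(fixed)
        world.update(zip(free, values))
        weight = 1.0
        for v, present in zip(free, values):
            weight *= probs[v] if present else 1.0 - probs[v]
        if formula(world):
            total += weight
    return total


def gradient(formula, variables, probs, var):
    """dP(formula)/dp_var = P(formula | var) - P(formula | not var).

    This follows from the Shannon expansion, since P(formula) is linear
    in each individual tuple probability.
    """
    return (marginal(formula, variables, probs, {var: True})
            - marginal(formula, variables, probs, {var: False}))


def learn(examples, variables, steps=10000, rate=0.25, seed=0):
    """Stochastic gradient descent on (P(formula) - label)^2 (assumed loss)."""
    rng = random.Random(seed)
    probs = {v: 0.5 for v in variables}
    for _ in range(steps):
        formula, label = rng.choice(examples)  # one labeled lineage per step
        error = marginal(formula, variables, probs) - label
        grads = {v: gradient(formula, variables, probs, v) for v in variables}
        for v in variables:
            probs[v] -= rate * 2.0 * error * grads[v]
            probs[v] = min(1.0, max(0.0, probs[v]))  # keep a valid probability
    return probs


# Hypothetical base tuples and labeled lineage formulas.
variables = ["t1", "t2", "t3"]
examples = [
    (lambda w: w["t1"] and w["t2"], 0.40),
    (lambda w: w["t1"] or w["t3"], 0.84),
    (lambda w: w["t2"], 0.50),
]

learned = learn(examples, variables)
for formula, label in examples:
    print(label, round(marginal(formula, variables, learned), 3))
```

The labels here were chosen so that an exact solution exists; with inconsistent labels the same loop would only find a compromise, and a decaying learning rate would likely be needed.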

Similar resources

Symmetry in Probabilistic Databases

Researchers in databases, AI, and machine learning have all proposed representations of probability distributions over relational databases (possible worlds). In a tuple-independent probabilistic database, the possible worlds all have distinct probabilities, because the tuple probabilities are distinct. In AI and machine learning, however, one typically learns highly symmetric distributions, w...

Efficient probabilistic models for inference and learning

This project was concerned with enriching probabilistic models with structured knowledge representation. By a probabilistic model we mean any formalism that can be used to specify a complex probability distribution. For instance, a Bayesian network specifies a joint probability distribution over a tuple of random variables by means of a directed acyclic graph, in such a way that only the condit...

$k$-tuple total restrained domination/domatic in graphs

For any integer $k \geq 1$, a set $S$ of vertices in a graph $G=(V,E)$ is a $k$-tuple total dominating set of $G$ if any vertex of $G$ is adjacent to at least $k$ vertices in $S$, and any vertex of $V-S$ is adjacent to at least $k$ vertices in $V-S$. The minimum number of vertices of such a set in $G$ is called the $k$-tuple total restrained domination number of $G$. The maximum num...

Probabilistic and Prioritized Data Retrieval in the Linda Coordination Model

Linda tuple spaces are flat and unstructured, in the sense that they do not allow for expressing preferences of tuples; for example, we could be interested in indicating tuples that should be returned more frequently w.r.t. other ones, or even tuples with a low relevance that should be taken under consideration only if there is no tuple with a higher importance. In this paper we investigate, in...

Learning to Play Othello with N-Tuple Systems

This paper investigates the use of n-tuple systems as position value functions for the game of Othello. The architecture is described, and then evaluated for use with temporal difference learning. Performance is compared with previously developed weighted piece counters and multi-layer perceptrons. The n-tuple system is able to defeat the best performing of these after just five hundred games o...

Journal title:
  • CoRR

Volume abs/1609.05103

Publication date 2016